The existent document is input to an input part 1, and the input
document is divided into morphemes by a morpheme analytic part 2. Based
on the result of this division, the content of the existent dictionary 4
is adjusted in a dictionary adjustment part 3 so as to be adapted for the
purpose of use.

The following is an explanation based on a specific example of an
input sentence. For example, when 'A tooth aches me very much.' is input as
the existent document to the input part 1, this input sentence is transmitted
to the morpheme analytic part 2. The morpheme analytic part 2 divides the
input sentence into 'A tooth / aches / me / very much.' First, the morpheme
list is searched as shown in Figure 3, which shows that the input sentence
includes the following candidate combinations of notation and part of speech
(each combination is referred to as a morpheme).

Table 1

Notation (romanized)   Part of speech
to                     case particle
te                     connective particle
mo                     adverbial postpositional particle
to-te-mo               adverb
mushiba                noun
ga                     case particle
i-ta-mu                ma-row five-level conjugating type verb
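
The candidate retrieval described above can be sketched as a substring lookup against the morpheme list. This is only an illustration under stated assumptions: the romanized input string, the `MORPHEME_LIST` layout and the function name `find_candidates` are hypothetical and are not taken from the original; only the notations and parts of speech come from Table 1.

```python
# Hypothetical morpheme list keyed by romanized notation (values from Table 1).
MORPHEME_LIST = {
    "to": ["case particle"],
    "te": ["connective particle"],
    "mo": ["adverbial postpositional particle"],
    "totemo": ["adverb"],
    "mushiba": ["noun"],
    "ga": ["case particle"],
    "itamu": ["ma-row five-level conjugating type verb"],
}


def find_candidates(text, morpheme_list):
    """Return every (start, end, notation, part_of_speech) found in text."""
    candidates = []
    for start in range(len(text)):
        for end in range(start + 1, len(text) + 1):
            piece = text[start:end]
            # A notation may carry several parts of speech; emit one
            # candidate per (notation, part of speech) combination.
            for pos in morpheme_list.get(piece, []):
                candidates.append((start, end, piece, pos))
    return candidates


# Assumed romanized form of the example sentence.
candidates = find_candidates("totemomushibagaitamu", MORPHEME_LIST)
```

For the assumed input, the lookup yields one candidate for each row of Table 1, including the overlapping 'to', 'te', 'mo' and 'to-te-mo' at the start of the sentence.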
At this time, only the word stems of conjugating words are registered
in the morpheme list, while the endings are provided separately for each
part of speech. When matching against the input, the concatenation of a
word stem and an ending may be matched, thereby compressing the morpheme
list and improving retrieval efficiency.
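
A minimal sketch of stem-plus-ending matching follows. The stem 'ita' and the ending set 'ma', 'mi', 'mu', 'me', 'mo' are assumptions chosen to illustrate a ma-row five-level verb; the original does not list the registered stems or endings, and the names `STEMS`, `ENDINGS` and `match_conjugating` are hypothetical.

```python
VERB = "ma-row five-level conjugating type verb"

# Only stems are stored in the morpheme list (assumed example stem).
STEMS = {"ita": VERB}

# Endings are stored once per part of speech (assumed ending set).
ENDINGS = {VERB: ["ma", "mi", "mu", "me", "mo"]}


def match_conjugating(text, start):
    """Return (surface form, part of speech) pairs that match at `start`."""
    matches = []
    for stem, pos in STEMS.items():
        if not text.startswith(stem, start):
            continue
        after_stem = start + len(stem)
        # The stem alone is not a match; one of its endings must follow.
        for ending in ENDINGS[pos]:
            if text.startswith(ending, after_stem):
                matches.append((stem + ending, pos))
    return matches
```

With this layout, each conjugated form does not need its own entry: `match_conjugating("itamu", 0)` and `match_conjugating("itami", 0)` are both served by the single stem 'ita'.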

Next, based on the table of concatenation shown in Figure 4, the
concatenation possibilities of all the morphemes are checked. In this case,
concatenation is possible only for the sequence 'to-te-mo' (adverb),
'mushiba' (noun), 'ga' (case particle) and 'i-ta-mu' (ma-row five-level
conjugating type verb). If there are plural possible concatenations, the
most suitable concatenation covering the input is selected by utilizing
information on the ease of concatenation between parts of speech, morpheme
length, the kind of characters, the number of all clauses, the number of
morphemes in the clauses, etc.
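
The check and selection can be sketched as a search over candidate morphemes that covers the whole input. The allowed part-of-speech pairs below are assumptions inferred from the example, the parts of speech are abbreviated, and "fewest morphemes" stands in for the several selection criteria the text lists; none of the names come from the original.

```python
# Candidates as (start, end, notation, part_of_speech) over an assumed
# romanized input of length 20 ("totemomushibagaitamu").
CANDIDATES = [
    (0, 2, "to", "case particle"),
    (0, 6, "totemo", "adverb"),
    (2, 4, "te", "connective particle"),
    (4, 6, "mo", "adverbial particle"),
    (6, 13, "mushiba", "noun"),
    (13, 15, "ga", "case particle"),
    (15, 20, "itamu", "verb"),
]

# Assumed subset of the concatenation table: (preceding, following).
ALLOWED = {
    ("adverb", "noun"),
    ("noun", "case particle"),
    ("case particle", "verb"),
}


def covering_paths(candidates, length, allowed):
    """Enumerate morpheme rows that cover the input and respect `allowed`."""
    by_start = {}
    for cand in candidates:
        by_start.setdefault(cand[0], []).append(cand)
    paths = []

    def extend(position, path):
        if position == length:
            paths.append(list(path))
            return
        for cand in by_start.get(position, []):
            if path and (path[-1][3], cand[3]) not in allowed:
                continue  # this pair of parts of speech cannot concatenate
            path.append(cand)
            extend(cand[1], path)
            path.pop()

    extend(0, [])
    return paths


def best_path(paths):
    """Pick the row with the fewest morphemes (one simple criterion)."""
    return min(paths, key=len)


paths = covering_paths(CANDIDATES, 20, ALLOWED)
best = best_path(paths)
```

Under these assumptions the row starting with 'to' is rejected at 'te', so the only covering row is 'to-te-mo', 'mushiba', 'ga', 'i-ta-mu'. `best_path` raises `ValueError` when no covering row exists, which a fuller implementation would need to handle.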

Thus, the result obtained by converting the input into a row of
morphemes is transmitted to the dictionary adjustment part 3. The
dictionary adjustment part 3 retrieves the corresponding entry in the
existent dictionary 4, as shown in Figure 5, by utilizing the
above-mentioned information for each morpheme. Optional information used
for selecting the entry at the time of kana (Japanese syllabary)-kanji
conversion (the entry is not necessarily limited to kanji and also includes
hiragana such as particles, auxiliary verbs, etc.) is adjusted so that the
best conversion rate is obtained. For example, the frequency of use, the
degree of concatenation, or the like is increased by a unit amount.
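
The adjustment step might look like the following sketch, which raises the frequency of use of each dictionary entry matched by an analyzed morpheme. The `Entry` type, the key layout and the unit amount of 1 are assumptions; only the 'mushiba' entry and its frequency of 20 come from Figure 5.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    reading: str
    part_of_speech: str
    frequency: int


UNIT = 1  # assumed unit amount; the original does not give a value


def adjust(dictionary, morphemes, unit=UNIT):
    """Increase the frequency of every entry matched by a morpheme."""
    for reading, pos in morphemes:
        entry = dictionary.get((reading, pos))
        if entry is not None:  # morphemes without an entry are skipped
            entry.frequency += unit


dictionary = {("mushiba", "noun"): Entry("mushiba", "noun", 20)}
adjust(dictionary, [("totemo", "adverb"), ("mushiba", "noun")])
```

After the call, the 'mushiba' entry has a frequency of 21, so it would be preferred slightly more strongly in later kana-kanji conversion.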
Figure 4

                                       Noun   Proper noun   Irregular noun
Noun                                   O      O             O
Proper noun                            O      O             X
Personal noun                          O      O             X
Irregular conjugating noun of sa-row   O      O             O

In Figure 4, O means that concatenation between the preceding part of
speech and the following part of speech is possible, and X means that it
is not possible.
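
A table of this kind can be held as a nested mapping and queried per pair of parts of speech. In the sketch below, treating the rows as the preceding part of speech and the columns as the following one is an assumption, since the figure does not label its axes; the names `CONCATENATION_TABLE` and `can_concatenate` are hypothetical.

```python
O, X = True, False

# Assumed orientation: outer key = preceding part of speech (rows),
# inner key = following part of speech (columns).
CONCATENATION_TABLE = {
    "noun": {"noun": O, "proper noun": O, "irregular noun": O},
    "proper noun": {"noun": O, "proper noun": O, "irregular noun": X},
    "personal noun": {"noun": O, "proper noun": O, "irregular noun": X},
    "irregular conjugating noun of sa-row": {
        "noun": O, "proper noun": O, "irregular noun": O,
    },
}


def can_concatenate(preceding, following):
    """Look up whether `following` may come directly after `preceding`."""
    # Pairs absent from the table are treated as not concatenable.
    return CONCATENATION_TABLE.get(preceding, {}).get(following, False)
```

For example, `can_concatenate("proper noun", "irregular noun")` returns `False`, matching the X in the second row of the table.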



Figure 5 

Notation   Part of speech                       Reading   Frequency of use
           Noun                                 mushiba   20
           Ba-row five-level conjugating verb   musubu
(57) Abstract:

PURPOSE: To perform a process in consideration of the
unity of words by providing a dictionary adjustment part
which, according to the result of dividing an input
existent document into morphemes, adjusts the contents
of a dictionary to the contents best suited to the
purpose of its use.

CONSTITUTION: The existent document is input to an
input part 1 and divided into morphemes by a morpheme
analytic part 2. For example, when 'A tooth aches me
very much.' is input as the existent document to the
input part 1, the morpheme analytic part 2 divides the
input sentence into 'A tooth / aches / me / very much.'
Then the dictionary adjustment part 3 retrieves, for
each morpheme, the corresponding entry in an existent
dictionary 4 by utilizing information on the ease of
concatenation between parts of speech, morpheme length,
the kind of characters, the number of all clauses, the
number of morphemes in the clauses, etc., and adjusts
the optional information used to select the entry at the
time of kana (Japanese syllabary)-kanji (Chinese
character) conversion so that the best conversion rate
is obtained. Consequently, the utilization efficiency of
the existent document is improved.